AITopics | bayesian model comparison

Collaborating Authors

bayesian model comparison

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards Trustworthy Amortized Bayesian Model Comparison

Kucharský, Šimon, Mishra, Aayush, Habermann, Daniel, Radev, Stefan T., Bürkner, Paul-Christian

arXiv.org Machine LearningAug-29-2025

Amortized Bayesian model comparison (BMC) enables fast probabilistic ranking of models via simulation-based training of neural surrogates. However, the reliability of neural surrogates deteriorates when simulation models are misspecified - the very case where model comparison is most needed. Thus, we supplement simulation-based training with a self-consistency (SC) loss on unlabeled real data to improve BMC estimates under empirical distribution shifts. Using a numerical experiment and two case studies with real data, we compare amortized evidence estimates with and without SC against analytic or bridge sampling benchmarks. SC improves calibration under model misspecification when having access to analytic likelihoods. However, it offers limited gains with neural surrogate likelihoods, making it most practical for trustworthy BMC when likelihoods are exact.

artificial intelligence, likelihood, machine learning, (18 more...)

arXiv.org Machine Learning

2508.20614

Country:

Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > Rensselaer County > Troy (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry:

Transportation (0.71)
Education > Educational Technology > Educational Software > Computer Based Training (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)

Add feedback

Savage-Dickey density ratio estimation with normalizing flows for Bayesian model comparison

Lin, Kiyam, Polanska, Alicja, Piras, Davide, Mancini, Alessio Spurio, McEwen, Jason D.

arXiv.org Machine LearningJun-6-2025

A core motivation of science is to evaluate which scientific model best explains observed data. Bayesian model comparison provides a principled statistical approach to comparing scientific models and has found widespread application within cosmology and astrophysics. Calculating the Bayesian evidence is computationally challenging, especially as we continue to explore increasingly more complex models. The Savage-Dickey density ratio (SDDR) provides a method to calculate the Bayes factor (evidence ratio) between two nested models using only posterior samples from the super model. The SDDR requires the calculation of a normalised marginal distribution over the extra parameters of the super model, which has typically been performed using classical density estimators, such as histograms. Classical density estimators, however, can struggle to scale to high-dimensional settings. We introduce a neural SDDR approach using normalizing flows that can scale to settings where the super model contains a large number of extra parameters. We demonstrate the effectiveness of this neural SDDR methodology applied to both toy and realistic cosmological examples. For a field-level inference setting, we show that Bayes factors computed for a Bayesian hierarchical model (BHM) and simulation-based inference (SBI) approach are consistent, providing further validation that SBI extracts as much cosmological information from the field as the BHM approach. The SDDR estimator with normalizing flows is implemented in the open-source harmonic Python package.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

2506.04339

Country:

Europe > Switzerland (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

An inferential measure of dependence between two systems using Bayesian model comparison

Marrelec, Guillaume, Giron, Alain

arXiv.org Machine LearningDec-10-2024

We propose to quantify dependence between two systems $X$ and $Y$ in a dataset $D$ based on the Bayesian comparison of two models: one, $H_0$, of statistical independence and another one, $H_1$, of dependence. In this framework, dependence between $X$ and $Y$ in $D$, denoted $B(X,Y|D)$, is quantified as $P(H_1|D)$, the posterior probability for the model of dependence given $D$, or any strictly increasing function thereof. It is therefore a measure of the evidence for dependence between $X$ and $Y$ as modeled by $H_1$ and observed in $D$. We review several statistical models and reconsider standard results in the light of $B(X,Y|D)$ as a measure of dependence. Using simulations, we focus on two specific issues: the effect of noise and the behavior of $B(X,Y|D)$ when $H_1$ has a parameter coding for the intensity of dependence. We then derive some general properties of $B(X,Y|D)$, showing that it quantifies the information contained in $D$ in favor of $H_1$ versus $H_0$. While some of these properties are typical of what is expected from a valid measure of dependence, others are novel and naturally appear as desired features for specific measures of dependence, which we call inferential. We finally put these results in perspective; in particular, we discuss the consequences of using the Bayesian framework as well as the similarities and differences between $B(X,Y|D)$ and mutual information.

dependence, information, mutual information, (16 more...)

arXiv.org Machine Learning

2412.06478

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Czechia > Prague (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Evidence Networks: simple losses for fast, amortized, neural Bayesian model comparison

Jeffrey, Niall, Wandelt, Benjamin D.

arXiv.org Artificial IntelligenceJan-10-2024

Evidence Networks can enable Bayesian model comparison when state-of-the-art methods (e.g. nested sampling) fail and even when likelihoods or priors are intractable or unknown. Bayesian model comparison, i.e. the computation of Bayes factors or evidence ratios, can be cast as an optimization problem. Though the Bayesian interpretation of optimal classification is well-known, here we change perspective and present classes of loss functions that result in fast, amortized neural estimators that directly estimate convenient functions of the Bayes factor. This mitigates numerical inaccuracies associated with estimating individual model probabilities. We introduce the leaky parity-odd power (l-POP) transform, leading to the novel ``l-POP-Exponential'' loss function. We explore neural density estimation for data probability in different models, showing it to be less accurate and scalable than Evidence Networks. Multiple real-world and synthetic examples illustrate that Evidence Networks are explicitly independent of dimensionality of the parameter space and scale mildly with the complexity of the posterior probability density function. This simple yet powerful approach has broad implications for model inference tasks. As an application of Evidence Networks to real-world data we compute the Bayes factor for two models with gravitational lensing data of the Dark Energy Survey. We briefly discuss applications of our methods to other, related problems of model comparison and evaluation in implicit inference settings.

artificial intelligence, evidence network, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1088/2632-2153/ad1a4d

2305.11241

Country:

South America > Chile (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Proximal nested sampling with data-driven priors for physical scientists

McEwen, Jason D., Liaudat, Tobías I., Price, Matthew A., Cai, Xiaohao, Pereyra, Marcelo

arXiv.org Machine LearningJul-28-2023

Proximal nested sampling was introduced recently to open up Bayesian model selection for high-dimensional problems such as computational imaging. The framework is suitable for models with a log-convex likelihood, which are ubiquitous in the imaging sciences. The purpose of this article is two-fold. First, we review proximal nested sampling in a pedagogical manner in an attempt to elucidate the framework for physical scientists. Second, we show how proximal nested sampling can be extended in an empirical Bayes setting to support data-driven priors, such as deep neural networks learned from training data.

artificial intelligence, machine learning, proximal nested, (18 more...)

arXiv.org Machine Learning

2307.00056

Country:

Europe > United Kingdom (0.05)
North America > United States > New York (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Add feedback

Bayesian Model Comparison by Monte Carlo Chaining

Neural Information Processing SystemsApr-6-2023, 18:09:05 GMT

The techniques of Bayesian inference have been applied with great success to many problems in neural computing including evaluation of regression functions, determination of error bars on predictions, and the treatment of hyper-parameters. However, the problem of model comparison is a much more challenging one for which current techniques have significant limitations. In this paper we show how an extended form of Markov chain Monte Carlo, called chaining, is able to provide effective estimates of the relative probabilities of different models. We present results from the robot arm problem and compare them with the corresponding results obtained using the standard Gaussian approximation framework. In a Bayesian treatment of statistical inference, our state of knowledge of the values of the parameters w in a model M is described in terms of a probability distribution function. Initially this is chosen to be some prior distribution p(wIM), which can be combined with a likelihood function p( Dlw, M) using Bayes' theorem to give a posterior distribution p(wID, M) in the form

artificial intelligence, bayesian model comparison, machine learning, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.80)

Add feedback

Model Comparison in Approximate Bayesian Computation

Boelts, Jan

arXiv.org Machine LearningMar-15-2022

A common problem in natural sciences is the comparison of competing models in the light of observed data. Bayesian model comparison provides a statistically sound framework for this comparison based on the evidence each model provides for the data. However, this framework relies on the calculation of likelihood functions which are intractable for most models used in practice. Previous approaches in the field of Approximate Bayesian Computation (ABC) circumvent the evaluation of the likelihood and estimate the model evidence based on rejection sampling, but they are typically computationally intense. Here, I propose a new efficient method to perform Bayesian model comparison in ABC. Based on recent advances in posterior density estimation, the method approximates the posterior over models in parametric form. In particular, I train a mixture-density network to map features of the observed data to the posterior probability of the models. The performance is assessed with two examples. On a tractable model comparison problem, the underlying exact posterior probabilities are predicted accurately. In a use-case scenario from computational neuroscience -- the comparison between two ion channel models -- the underlying ground-truth model is reliably assigned a high posterior probability. Overall, the method provides a new efficient way to perform Bayesian model comparison on complex biophysical models independent of the model architecture.

posterior, posterior probability, probability, (17 more...)

arXiv.org Machine Learning

doi: 10.13140/RG.2.2.20085.88807

2203.11276

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Input complexity and out-of-distribution detection with likelihood-based generative models

Serrà, Joan, Álvarez, David, Gómez, Vicenç, Slizovskaia, Olga, Núñez, José F., Luque, Jordi

arXiv.org Machine LearningSep-25-2019

On the other side of the spectrum, we observe that the data set with a lower log-likelihood is Noise, a data set of uniform random images, followed by TrafficSign and TinyImageNet; both featuring colorful images with nontrivial background. Such ordering is perhaps more clear by looking at the average log-likelihood of each data set (Appendix D). If we think about the visual complexity of the images in those data sets, it would seem that log-likelihoods tend to grow when images become simpler and with less information or content. To further confirm the previous observation, we design a controlled experiment where we can set different decreasing levels of image complexity. We train a generative model with some data set, as before, but now compute likelihoods of progressively simpler inputs. Such inputs are obtained by average-pooling the uniform random Noise images by factors of 1, 2, 4, 8, 16, and 32, and re-scaling back the images to the original size by nearest-neighbor up-sampling. Intuitively, a noise image with a pooling size of 1 (no pooling) has the highest complexity, while a noise image with a pooling of 32 (constant-color image) has the lowest complexity.

compressor, generative model, likelihood, (14 more...)

arXiv.org Machine Learning

1909.1148

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
North America > United States (0.04)

Genre: Research Report > Experimental Study (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.66)

Add feedback

Bayesian model comparison with un-normalised likelihoods

Everitt, Richard G., Johansen, Adam M., Rowing, Ellen, Evdemon-Hogan, Melina

arXiv.org Machine LearningJan-20-2016

Models for which the likelihood function can be evaluated only up to a parameter-dependent unknown normalising constant, such as Markov random field models, are used widely in computer science, statistical physics, spatial statistics, and network analysis. However, Bayesian analysis of these models using standard Monte Carlo methods is not possible due to the intractability of their likelihood functions. Several methods that permit exact, or close to exact, simulation from the posterior distribution have recently been developed. However, estimating the evidence and Bayes' factors (BFs) for these models remains challenging in general. This paper describes new random weight importance sampling and sequential Monte Carlo methods for estimating BFs that use simulation to circumvent the evaluation of the intractable likelihood, and compares them to existing methods. In some cases we observe an advantage in the use of biased weight estimates. An initial investigation into the theoretical and empirical properties of this class of methods is presented. Some support for the use of biased estimates is presented, but we advocate caution in the use of such estimates.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1007/s11222-016-9629-2

1504.00298

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)

Add feedback

Bayesian Model Comparison by Monte Carlo Chaining

Barber, David, Bishop, Christopher M.

Neural Information Processing SystemsDec-31-1997

Neural Computing Research Group Aston University, Birmingham, B4 7ET, U.K. http://www.ncrg.aston.ac.uk/ Abstract The techniques of Bayesian inference have been applied with great success to many problems in neural computing including evaluation of regression functions, determination of error bars on predictions, and the treatment of hyper-parameters. However, the problem of model comparison is a much more challenging one for which current techniques have significant limitations. In this paper we show how an extended form of Markov chain Monte Carlo, called chaining, is able to provide effective estimates of the relative probabilities of different models. We present results from the robot arm problem and compare them with the corresponding results obtained using the standard Gaussian approximation framework. Initially this is chosen to be some prior distribution p(wIM), which can be combined with a likelihood function p( Dlw, M) using Bayes' theorem to give a posterior distribution p(wID, M) in the form (ID M) p(Dlw,M)p(wIM) (1) p w, p(DIM) where D is the data set. Predictions of the model are obtained by performing integrations weighted by the posterior distribution.

approximation, bayesian model comparison, model evidence, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback